225 research outputs found
Mask-Predict: Parallel Decoding of Conditional Masked Language Models
Most machine translation systems generate text autoregressively from left to
right. We, instead, use a masked language modeling objective to train a model
to predict any subset of the target words, conditioned on both the input text
and a partially masked target translation. This approach allows for efficient
iterative decoding, where we first predict all of the target words
non-autoregressively, and then repeatedly mask out and regenerate the subset of
words that the model is least confident about. By applying this strategy for a
constant number of iterations, our model improves state-of-the-art performance
levels for non-autoregressive and parallel decoding translation models by over
4 BLEU on average. It is also able to reach within about 1 BLEU point of a
typical left-to-right transformer model, while decoding significantly faster.Comment: EMNLP 201
Zero-Shot Relation Extraction via Reading Comprehension
We show that relation extraction can be reduced to answering simple reading
comprehension questions, by associating one or more natural-language questions
with each relation slot. This reduction has several advantages: we can (1)
learn relation-extraction models by extending recent neural
reading-comprehension techniques, (2) build very large training sets for those
models by combining relation-specific crowd-sourced questions with distant
supervision, and even (3) do zero-shot learning by extracting new relation
types that are only specified at test-time, for which we have no labeled
training examples. Experiments on a Wikipedia slot-filling task demonstrate
that the approach can generalize to new questions for known relation types with
high accuracy, and that zero-shot generalization to unseen relation types is
possible, at lower accuracy levels, setting the bar for future work on this
task.Comment: CoNLL 201
- …